[...] most often used. While both techniques produce similar results, the
variation in procedure used to obtain the data has limited most 
direct comparisons. Because of this, the differences between the 
several methods of making the evaluations have not been adequately 
explored. In this study, three methods of associate (peer) 
evaluation (one rating procedure and two nomination procedures) were
[...] concurrent performance measures. All three evaluations yielded levels
of reliability adequate for use in short-range individual selections; 
all three methods measured the same individual attributes. The 
nomination method was suggested as the clear choice for operational 
use because of the additional benefits of minimal rater resistance,
ease of scoring, and simple administration. (Author) 
FOREWORD



The Leadership Performance Technical Area has among its objectives the identification of
personal characteristics of performance and their potential for use in the Officer Career
Management System and the points of application of these measures in improving current
procedures used for military school selection, promotion, and duty assignment. As one
aspect of the evaluation of these personal characteristics, research is now underway for the
experimental introduction of peer ratings in Officer Basic Courses and is projected for Officer
Advanced Courses. Peer (or associate) evaluations have in the past been found to be valid
predictors of future performance (potential) in a number of military situations but must be
investigated in the new setting in which they are now being applied.

The entire task is responsive to special requirements of the Deputy Chief of Staff for
Personnel and the Military Personnel Center, as well as to the objectives of RDTE Project 
2Q16310A755, Manpower System Development. 

The present publication examines the effects of evaluation procedures on psychometric 
properties, reliability, and concurrent validity of associate evaluations. 




UHLANER 
Technical Director






ASSOCIATE EVALUATIONS: NOMINATIONS VS. RATINGS 



BRIEF 



REQUIREMENT: 

To determine if a nomination procedure of associate evaluations can be substituted for a rating 
procedure. 

PROCEDURE: 

Data were collected on 125 Army officers attending Branch Basic School. Three different
scoring procedures were used, representing a rating procedure and two nomination procedures. 
Estimations of reliabilities were compared across procedures, and correlations (indices of 
relationships) were compared with a degree-of-acquaintanceship score, a Leadership Battery, and 
school grades. 



FINDINGS: 

The reliabilities of all procedures were very similar, with some indications that the use of too
many individuals in a nominations technique might lower reliability. With the exception of the 
acquaintanceship scores, there were no differences between techniques in the correlation with 
other scores. The nomination technique with fewer individuals nominated had a significantly lower 
relationship with acquaintanceship. A nomination procedure is found most readily usable. 



UTILIZATION OF FINDINGS: 

The present analysis is the first step in the experimental introduction of associate ratings into 
Army Schools. The use of a nomination technique saves rater time and effort, is administratively 
simple and increases acceptance of associate evaluations. Future research will focus upon the issues 
of reliability across schools, acceptability, feasibility, and validity of associate evaluations. 
ASSOCIATE EVALUATIONS: NOMINATIONS VS. RATINGS



THE PROBLEM 

The U.S. Army has a long history of using associate evaluations
(peer ratings) as selection and evaluation devices in training situations.
The best known and most comprehensively researched Army program is the
"Aptitude for Service Ratings" at the U.S. Military Academy. [1, 2]
Associate evaluations have also been investigated for use in personnel
selection. Peer ratings in industry have been found capable of predicting
future performance as well as accurately reflecting current
performance. [3] However, the methods used in obtaining the data have varied
considerably. Two methods are used most frequently: the rating procedure,
where each member of the rating group assigns every other member a score
from an evaluation scale, and the nomination procedure, where each
member of the rating group selects a given number of top and bottom
individuals in terms of value from the total group. While both techniques
have produced similar results, they have been used primarily in different
studies, so that direct comparisons are limited. However, Suci [4] found
that several procedures produced results of about the same reliability
and recommended the use of nominations because they were easier to
prepare, administer, and score and less frustrating to the rater.
Hammer [5] also found that rankings and nominations of the same individuals
produced similar evaluations and recommended the use of nominations.
However, neither studied the differences between the two techniques in
their relationship with other measures.



[1] Haggerty, H.R. Status report on research for the U.S. Military Academy.
ARI Technical Research Report 1155. (AD 452 090) October 1965.

[2] Tobin, D.J., and Marcum, R.H. Leadership evaluation. Research Report,
Office of Military Psychology and Leadership, U.S. Army Military Academy.
West Point, N.Y., 1967.

[3] Nadal, Ramon A. A review of peer rating studies. Research Report No.
6P-8, Office of Military Psychology and Leadership, U.S. Army
Military Academy. West Point, N.Y., 1968.

[4] Suci, G.J., Vallance, T.R., and Glickman, A.S. An analysis of peer
ratings: I. The assessment of reliability of several questionnaire forms
and techniques used at the Naval Officer Candidate School. Bureau of
Naval Personnel Technical Bulletin 54-9. Newport, R.I., 1954.

[5] Hammer, C.H. A simplified technique for evaluating basic trainees on
leadership potential. ARI Research Memorandum 65-10. 1965.
OBJECTIVES



The overall objectives of this study were to investigate the reliabilities
of three types of associate evaluations and further to compare
each evaluation's relationships with other measures of leadership and
school performance. The specific objectives of the research were to
compare three types of associate evaluation scoring (one rating procedure
and two nomination procedures) in terms of 1) reliability, 2)
interrelationship of associate evaluation techniques, 3) relationship
with other leadership measures (i.e., the Officer Evaluation Battery),
and 4) relationship with concurrent performance measures (i.e., school
grades). Points 3) and 4) were included to expand knowledge beyond
results of previous studies.



METHOD 

Sample Population 

All officers attending a nine-week training course (N = 125) were
used for the study. Almost all were 2nd Lieutenants on active duty only
for the training period. While some individuals were appointed on a
rotating basis to student command positions, these positions were
primarily nominal in nature. The officers attended classes approximately
8 hours a day, five days a week, and went "home" in the evening. The
officers were split into 4 platoons, the platoon being the evaluation
group within which leadership choices were made. Once an associate
evaluation score was produced within a rating group, all individuals
from all platoons were combined into one group for the analyses reported in
the results section.



Variables 

Four distinct sets of data were collected. They were: 

Associate evaluations. Each officer was forced to rate individuals
along a 7-point scale with "equal" numbers in each category (rating
scale). Each officer was then instructed to select the one officer in
his platoon who had the highest leadership potential. Next he was
instructed to select the officer who had the lowest leadership potential,
continuing until 1/7 of the group was in the high and 1/7 in the low
categories (nomination score 1). He then continued with the next highest
and lowest 1/7 (top and bottom two categories; nomination score 2) and
again the highest and lowest 1/7; the remaining 1/7 was placed in a
middle category and included the individuals he did not know.

Experimental diagnostic leadership measures. The Officer Evaluation
Battery (OEB) (FT 4954 and FT 4955) was administered to all officers at
the start of training. The OEB yields seven scale scores:
Combat Leadership, Technical-Managerial Leadership, and Career Potential,
with a cognitive (or knowledge) factor and a non-cognitive (or attitudinal)
factor for each, plus Career Intent. (See the "Manual for Interpreting
the Officer Evaluation Battery" [6] for further explanation of the
scales and development of the test.)

Training grades . A variety of evaluative techniques were used by 
the school to measure performance during training. Table 1 lists the 
various evaluations used. 

Acquaintanceship ratings. Each officer was instructed to rate the
degree to which he was acquainted with each member of his group. Ratings
were done on a five-point scale (1 = DO NOT KNOW; 2 = MET ONCE OR TWICE;
3 = LIMITED CONTACT IN CLASSES; 4 = EXTENSIVE CONTACT IN CLASSES; and
5 = CLOSE PERSONAL RELATIONSHIP).

Analysis 

Peer evaluations were scored three different ways. 



1) $R = \frac{\sum r}{n}$

2) $N_1 = \frac{\sum rT_1}{n}$

3) $N_2 = \frac{\sum rT_2}{n}$

where:

$R$ = associate rating score
$r$ = scale score (1-7) received by a person
$n$ = number of persons giving an evaluation
$N_1$ = nomination score 1
$rT_1$ = scale score transformed as follows:
         1 = 1; 2, 3, 4, 5, or 6 = 2; and 7 = 3
$N_2$ = nomination score 2
$rT_2$ = scale score transformed as follows:
         1 or 2 = 1; 3, 4, or 5 = 2; and 6 or 7 = 3
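The three scoring rules can be sketched in a few lines of Python. This is only an illustration of the definitions above, not the program used in the study; the function name `associate_scores`, the input format (a list of the 7-point scale scores one officer received from the raters in his platoon), and the example data are assumptions.

```python
from statistics import mean

# Transformations of the 7-point scale score, following the definitions above.
T1 = {1: 1, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 3}  # top and bottom categories only
T2 = {1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3, 7: 3}  # top and bottom two categories


def associate_scores(received):
    """Return (R, N1, N2) for one officer.

    `received` holds the 7-point scale scores given to that officer by the
    n raters in his rating group (hypothetical input format).
    """
    r = mean(received)                   # R  = sum(r) / n
    n1 = mean(T1[s] for s in received)   # N1 = sum(rT1) / n
    n2 = mean(T2[s] for s in received)   # N2 = sum(rT2) / n
    return r, n1, n2


# Illustrative scores only, not data from the study.
print(associate_scores([7, 6, 4, 4, 2]))
```

Because both nomination scores are derived from the same 7-point responses, all three scores can be computed from a single administration, which is how the comparison in this study was possible.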



[6] U.S. Army Research Institute for the Behavioral and Social Sciences.
Manual for interpreting the Officer Evaluation Battery. Arlington, Va: 
Army Research Institute, 1975. 
Table 1

MEANS AND STANDARD DEVIATIONS OF VARIABLES
(N = 125)

Variable                                          Mean          Standard Deviation

Evaluations
  Rating (R) 4th week                             [illegible]    1.18
  Rating (R) 8th week                             5.96           1.16
  Nomination (N1) 4th week
    (top & bottom categories)                     2.00            .50
  Nomination (N1) 8th week                        2.00            .29
  Nomination (N2) 4th week
    (top & bottom two categories)                 2.12            .56
  Nomination (N2) 8th week                        2.11            .56
  4th week acquaintanceship rating                5.57            .56
  8th week acquaintanceship rating                5.55            .54

OEB
  Combat Leadership (Cognitive)                   107.62         20.07
  Managerial-Tech Lead. (Cognitive)               107.27         21.80
  Career Potential (Cognitive)                    109.44         20.75
  Combat Leadership (Non-cognitive)               109.05         18.51
  Managerial-Tech Lead. (Non-cognitive)           101.47         20.81
  Career Potential (Non-cognitive)                105.95         16.20
  Career Intent (Non-cognitive)                    95.59         17.66

School Grades
  Maintenance Management                           84.12          8.08
  Combat Engineer Practical (Lead.)                85.05          8.24
  Leadership Exam                                  77.45         10.15
  Night Land Navigation                            95.60         15.64
  Physical Fitness                                 79.68         [illegible]
  Leadership, Staff, Intelligence, etc.            89.50          7.75
  Combat Operation                                 80.99          7.64
  Engineer Reconnaissance                          85.86          7.89
  Combat Engineer Practical (Tech)                 80.48          6.50
  Orienteering                                     89.12          7.52
  Fixed Bridges and Construction                   80.04         12.64
  Heavy Construction                               79.85         15.85
Acquaintanceship ratings were converted to scores for an individual by
computing his mean ratings. His score then became the degree to which 
he was known by the group as a whole. 

Reliabilities of associate evaluations were estimated by using the 
split-half (group) technique, [7] where random halves (raters) of the rating
group were used to produce two separate scores for each individual.
These two scores, one from each half, were correlated with each other over all
rating groups. The correlation was then corrected by use of the
Spearman-Brown prophecy formula. The same split of the rating group was used for
the split-half estimate for all three associate evaluation techniques. 
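A minimal sketch of this split-half estimate follows, assuming each rating group is stored as a mapping from rater to the scores he gave (a hypothetical data layout) and that half scores are pooled across groups before a single correlation is computed. The Spearman-Brown step-up for a doubled test length, 2r / (1 + r), is the standard formula; the remaining details (function names, seeding, how raters are split) are assumptions and not a reproduction of the procedure cited above.

```python
import random
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-group correlation up to the full rating group."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(groups, seed=0):
    """Estimate reliability from random halves of the raters in each group.

    `groups` is a list of rating groups; each group maps a rater to a dict
    of {ratee: score} (raters do not rate themselves).
    """
    rng = random.Random(seed)
    first, second = [], []
    for ratings in groups:
        raters = sorted(ratings)
        rng.shuffle(raters)
        cut = len(raters) // 2
        half_a, half_b = raters[:cut], raters[cut:]
        ratees = sorted({p for given in ratings.values() for p in given})
        for p in ratees:
            a = [ratings[r][p] for r in half_a if p in ratings[r]]
            b = [ratings[r][p] for r in half_b if p in ratings[r]]
            if a and b:
                first.append(mean(a))   # score from the first half of raters
                second.append(mean(b))  # score from the second half of raters
    return spearman_brown(pearson(first, second))
```

Reusing one seed for all three scoring methods mirrors the statement that the same split of the rating group was used for each technique.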

Product moment correlations were computed between all associate 
evaluation techniques. Zero order correlations were computed between 
each associate evaluation technique and the remaining variables. 
Hotelling's t-tests [8] for differences between pair-wise correlation
coefficients were performed. 
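For readers unfamiliar with the test, the sketch below uses the usual textbook form of Hotelling's t for two correlations that share one variable and are computed on the same sample, with N - 3 degrees of freedom. It is assumed, not confirmed by the text, that this is the form applied here; the function name and the argument names are illustrative.

```python
import math


def hotelling_t(r_xy, r_xz, r_yz, n):
    """Hotelling's t for the difference between two dependent correlations.

    r_xy, r_xz : correlations of a common variable x (e.g. a school grade)
                 with y and z (e.g. two associate evaluation scores)
    r_yz       : correlation between y and z
    n          : number of individuals
    Returns (t, degrees_of_freedom).
    """
    # Determinant of the 3 x 3 correlation matrix of x, y, and z.
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    t = (r_xy - r_xz) * math.sqrt((n - 3) * (1 + r_yz) / (2 * det))
    return t, n - 3


# Illustrative values only, not results from the study.
t, df = hotelling_t(0.60, 0.50, 0.80, 125)
print(round(t, 2), df)
```

The test gains power as the two evaluation scores become more highly intercorrelated, which matters here because the three associate evaluation scores are derived from the same responses.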



RESULTS 

Table 1 lists all variables with their means and standard deviations.
Table 2 presents the split-half reliabilities and intercorrelations for
all associate evaluation techniques and for 4th and 8th weeks. The
split-half reliabilities were quite similar for Ratings and N1 Nominations
(top/bottom categories), but N2 Nominations (top/bottom two
categories) were lower, .85. The test-retest reliabilities were very
high (.90's) for all methods. Finally, there was a high degree of
relationship between all techniques for one session with slightly
smaller values across sessions.

The relationships of associate evaluations with the Officer Evaluation
Battery and school grades are shown in Table 3. One hundred and
twenty-six pair-wise Hotelling t-tests were computed for differences
between correlation coefficients for each associate evaluation (R, N1,
and N2 for the fourth and the eighth week); i.e., six t-tests (R vs N1,
R vs N2, and N1 vs N2 for the fourth and eighth weeks) were performed
for each of the 21 variables. Three significant differences (p < .05)
were found (t = 2.55 for fourth week R vs N^, Managerial-Technical
Leadership, cognitive; t = 2.29 for eighth week R vs N^, Managerial-Technical
Leadership, non-cognitive; and t = 2.67 for fourth week R vs N^,
Fixed Bridges and Construction exam). It was recognized that



[7] Gordon, L.V. Estimating the reliability of peer ratings. Educational
and Psychological Measurement, 1969, 29, 305-313.

[8] Guilford, J.P. Fundamental Statistics in Psychology and Education.
(4th ed.) New York: McGraw-Hill, 1956.
the three tests performed within a week, R vs N1, R vs N2, and N1 vs
N2, were not independent tests, and therefore the p value of .05 was
inflated slightly to some unknown value. A more exact test was not
found, and the small number of significant differences found would
indicate that this error did not produce an identifiable distortion in
the results.

A different pattern emerged from acquaintanceship, as is shown in
Table 4. No significant differences were found between Ratings and
N2 Nominations, but N1 Nominations had a significantly lower relationship
with acquaintanceship than either Ratings or N2 Nominations (R vs N1,
4th week: t = 5.64, p < .05; N1 vs N2, 4th week: t = 5.05, p < .05;
R vs N1, 8th week: t = 5.86, p < .05; and N1 vs N2, 8th week:
t = 5.16, p < .05).



Table 4

CORRELATION OF ASSOCIATE EVALUATIONS WITH ACQUAINTANCESHIP

                                      Acquaintanceship
Evaluation                        Fourth week    Eighth week

Rating 4th week                       .64            .51
Rating 8th week                       .62            .50
Nomination (N1) 4th week              .54            .55
Nomination (N1) 8th week              .57            .59
Nomination (N2) 4th week              .65            .55
Nomination (N2) 8th week              .61            .55



CONCLUSIONS 

The three methods of scoring the associate evaluations yielded
comparable levels of reliability which were high enough to justify their
use for individual selection purposes. There was some indication that
the use of large numbers of individuals in high and low categories
(N2 Nomination technique) might yield slightly lower split-half
reliabilities. This would seem to reflect the difficulty of making
reliable discriminations for the category 6 and 2 individuals plus the
dilution of this information into category 7 and 1 individuals. Furthermore,
the correlations of each technique with other leadership measures
(OEB and school grades) and technical training grades indicated that the
associate evaluation techniques were all measuring the same things.
This was further substantiated by the high degree of interrelationship
between techniques. The only difference found was the lower degree of
relationship between N1 Nominations and acquaintanceship scores. The
implication is that the less extreme scores (middle categories) are
determined more by the degree to which a person is known by individuals
in the group. This did not, however, affect the relationship with other
measures. This finding of a lack of relationship between acquaintanceship
and performance has been consistent. [9]

On the basis of these findings it would seem that nominations, using 
a relatively small number of individuals, can be substituted for full 
rating without any loss in reliability or degree of relationship with 
concurrent performance measures. A potential benefit is a decreased 
reliance on acquaintanceship (friendship/popularity) for the more 
difficult middle category evaluations. 

An assumption made, but yet unproven, is that modifying the instruc- 
tions to the raters to reflect only a nomination technique will not 
change their behavior, i.e., the individuals selected. Research is now 
underway to investigate the results of using a nomination technique 
very similar to the one administered here but with instructions for 
choosing individuals for only the top and bottom categories.

If the additional benefits of decreased rater resistance to making 
nominations and the greater ease with which evaluations can be
administratively handled and scored are added to the above research findings, the
nominations (N1) technique is the clear choice for operational use. Two
cautions should be added to this generalization. First, the effect of
group size was not investigated and there are some reasons to suspect 
that the findings might not hold for smaller evaluation groups. Second, 
the use of associate evaluations as measures of long-term performance 
was not studied and the possibility exists that the evaluation techniques 
could yield different results for these measures. These two potential 
problem areas are now under investigation. 



[9] Hollander, E.P., and Webb, W.B. Leadership, followership, and friendship:
an analysis of peer nominations. Journal of Abnormal and Social
Psychology, 1955, 50, 163-167.